AITopics | development team

Collaborating Authors

development team

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection

Nie, Yuzhou, Li, Hongwei, Guo, Chengquan, Jiang, Ruizhe, Wang, Zhun, Li, Bo, Song, Dawn, Guo, Wenbo

arXiv.org Artificial IntelligenceDec-9-2025

We propose VulnLLM-R, the~\emph{first specialized reasoning LLM} for vulnerability detection. Our key insight is that LLMs can reason about program states and analyze the potential vulnerabilities, rather than simple pattern matching. This can improve the model's generalizability and prevent learning shortcuts. However, SOTA reasoning LLMs are typically ultra-large, closed-source, or have limited performance in vulnerability detection. To address this, we propose a novel training recipe with specialized data selection, reasoning data generation, reasoning data filtering and correction, and testing-phase optimization. Using our proposed methodology, we train a reasoning model with seven billion parameters. Through extensive experiments on SOTA datasets across Python, C/C++, and Java, we show that VulnLLM-R has superior effectiveness and efficiency than SOTA static analysis tools and both open-source and commercial large reasoning models. We further conduct a detailed ablation study to validate the key designs in our training recipe. Finally, we construct an agent scaffold around our model and show that it outperforms CodeQL and AFL++ in real-world projects. Our agent further discovers a set of zero-day vulnerabilities in actively maintained repositories. This work represents a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models. The code is available at~\href{https://github.com/ucsb-mlsec/VulnLLM-R}{github}.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2512.07533

Country:

North America > United States > California (0.67)
North America > United States > Illinois (0.46)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

All of My Employees Are AI Agents, and So Are My Executives

WIREDNov-12-2025, 11:00:00 GMT

Sam Altman says the one-person billion-dollar company is coming. Maybe I could be that person--if only I could get my colleagues to shut up and stop lying. One day a couple months ago, in the middle of lunch, I glanced at my phone and was puzzled to see my colleague Ash Roy calling. In and of itself it might not have seemed strange to get a call from Ash: He's the CTO and chief product officer of HurumoAI, a startup I cofounded last summer. We were in the middle of a big push to get our software product, an AI agent application, into beta. There was plenty to discuss. "Hey there," he said, when I picked up. He was calling, he said, because I'd requested a progress report on the app from Megan. "I've been good," I said, chewing my grilled cheese.

agent, artificial intelligence, natural language, (15 more...)

WIRED

Country: North America > United States > California (0.28)

Industry:

Information Technology (0.47)
Law (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.64)

Add feedback

Before the Clinic: Transparent and Operable Design Principles for Healthcare AI

Bakumenko, Alexander, Masino, Aaron J., Hoelscher, Janine

arXiv.org Artificial IntelligenceNov-5-2025

The translation of artificial intelligence (AI) systems into clinical practice requires bridging fundamental gaps between explainable AI theory, clinician expectations, and governance requirements. While conceptual frameworks define what constitutes explainable AI (XAI) and qualitative studies identify clinician needs, little practical guidance exists for development teams to prepare AI systems prior to clinical evaluation. We propose two foundational design principles, Transparent Design and Operable Design, that operationalize pre-clinical technical requirements for healthcare AI. Transparent Design encompasses interpretability and understandability artifacts that enable case-level reasoning and system traceability. Operable Design encompasses calibration, uncertainty, and robustness to ensure reliable, predictable system behavior under real-world conditions. We ground these principles in established XAI frameworks, map them to documented clinician needs, and demonstrate their alignment with emerging governance requirements. This pre-clinical playbook provides actionable guidance for development teams, accelerates the path to clinical evaluation, and establishes a shared vocabulary bridging AI researchers, healthcare practitioners, and regulatory stakeholders. By explicitly scoping what can be built and verified before clinical deployment, we aim to reduce friction in clinical AI translation while remaining cautious about what constitutes validated, deployed explainability.

data mining, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.01902

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

Add feedback

Impact of LLMs on Team Collaboration in Software Development

Dhanuka, Devang

arXiv.org Artificial IntelligenceOct-13-2025

Large Language Models (LLMs) are increasingly being integrated into software development processes, with the potential to transform team workflows and productivity. This paper investigates how LLMs affect team collaboration throughout the Software Development Life Cycle (SDLC). We reframe and update a prior study with recent developments as of 2025, incorporating new literature and case studies. We outline the problem of collaboration hurdles in SDLC and explore how LLMs can enhance productivity, communication, and decision-making in a team context. Through literature review, industry examples, a team survey, and two case studies, we assess the impact of LLM-assisted tools (such as code generation assistants and AI-powered project management agents) on collaborative software engineering practices. Our findings indicate that LLMs can significantly improve efficiency (by automating repetitive tasks and documentation), enhance communication clarity, and aid cross-functional collaboration, while also introducing new challenges like model limitations and privacy concerns. We discuss these benefits and challenges, present research questions guiding the investigation, evaluate threats to validity, and suggest future research directions including domain-specific model customization, improved integration into development tools, and robust strategies for ensuring trust and security.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.08612

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.89)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Jailbreaking: Auditing Contextual Privacy in LLM Agents

Das, Saswat, Sandler, Jameson, Fioretto, Ferdinando

arXiv.org Artificial IntelligenceSep-30-2025

LLM agents have begun to appear as personal assistants, customer service bots, and clinical aides. While these applications deliver substantial operational benefits, they also require continuous access to sensitive data, which increases the likelihood of unauthorized disclosures. Moreover, these disclosures go beyond mere explicit disclosure, leaving open avenues for gradual manipulation or sidechannel information leakage. This study proposes an auditing framework for conversational privacy that quantifies an agent's susceptibility to these risks. The proposed Conversational Manipulation for Privacy Leakage (CMPL) framework is designed to stress-test agents that enforce strict privacy directives against an iterative probing strategy. Rather than focusing solely on a single disclosure event or purely explicit leakage, CMPL simulates realistic multi-turn interactions to systematically uncover latent vulnerabilities. Our evaluation on diverse domains, data modalities, and safety configurations demonstrates the auditing framework's ability to reveal privacy risks that are not deterred by existing single-turn defenses, along with an in-depth longitudinal study of the temporal dynamics of leakage, strategies adopted by adaptive adversaries, and the evolution of adversarial beliefs about sensitive targets. In addition to introducing CMPL as a diagnostic tool, the paper delivers (1) an auditing procedure grounded in quantifiable risk metrics and (2) an open benchmark for evaluation of conversational privacy across agent implementations.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.10171

Country: North America > United States (1.00)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Can We Build AI That Does Not Harm Queer People?

Communications of the ACMApr-22-2025, 14:55:56 GMT

AI safety is a contentious topic. While some prominent figures of the AI community have argued that destructive general artificial intelligence (AI) is on the horizon, others derided their warning as a marketing stunt to sell large language models (LLMs). "If the call for'AI safety' is couched in terms of protecting humanity from rogue AIs, it very conveniently displaces accountability away from the corporations scaling harm in the name of profits," tweeted Emily Bender, a professor of computational linguistics at the University of Washington. Focusing on potential future harm from ever more powerful AI systems distracts from harm that is already happening today. Most of us do not set out to make software that is actively harmful.

large language model, natural language, queer identity term, (17 more...)

Communications of the ACM

Industry: Media (0.96)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.57)

Add feedback

Practical Application and Limitations of AI Certification Catalogues in the Light of the AI Act

Autischer, Gregor, Waxnegger, Kerstin, Kowald, Dominik

arXiv.org Artificial IntelligenceFeb-18-2025

In this work-in-progress, we investigate the certification of AI systems, focusing on the practical application and limitations of existing certification catalogues in the light of the AI Act by attempting to certify a publicly available AI system. We aim to evaluate how well current approaches work to effectively certify an AI system, and how publicly accessible AI systems, that might not be actively maintained or initially intended for certification, can be selected and used for a sample certification process. Our methodology involves leveraging the Fraunhofer AI Assessment Catalogue as a comprehensive tool to systematically assess an AI model's compliance with certification standards. We find that while the catalogue effectively structures the evaluation process, it can also be cumbersome and time-consuming to use. We observe the limitations of an AI system that has no active development team anymore and highlighted the importance of complete system documentation. Finally, we identify some limitations of the certification catalogues used and proposed ideas on how to streamline the certification process.

artificial intelligence, certification process, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.10398

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.93)
Law > Statutes (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.93)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
Information Technology > Artificial Intelligence > Cognitive Science (0.68)
(2 more...)

Add feedback

Billion-dollar video game: is this the most expensive piece of entertainment ever made?

The GuardianJan-16-2025, 11:30:51 GMT

How much does it cost to make a video game? The development expenses of blockbuster games are closely guarded business secrets, but they have been climbing ever higher over the years towards big Hollywood-style spending. Industry leaks have exposed how the budgets of major video games are spiralling upwards: 100m, or 200m, even more. One of the bestselling franchises, Call of Duty, saw costs balloon to 700m ( 573m), a number only revealed recently when a reporter dug into court filings. There is, however, one game with a budget that is anything but secret.

expensive piece, star citizen, video game, (10 more...)

The Guardian

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Games (0.84)

Add feedback

Effective Monitoring of Online Decision-Making Algorithms in Digital Intervention Implementation

Trella, Anna L., Ghosh, Susobhan, Bonar, Erin E., Coughlin, Lara, Doshi-Velez, Finale, Guo, Yongyi, Hung, Pei-Yao, Nahum-Shani, Inbal, Shetty, Vivek, Walton, Maureen, Yan, Iris, Zhang, Kelly W., Murphy, Susan A.

arXiv.org Artificial IntelligenceAug-30-2024

Online AI decision-making algorithms are increasingly used by digital interventions to dynamically personalize treatment to individuals. These algorithms determine, in real-time, the delivery of treatment based on accruing data. The objective of this paper is to provide guidelines for enabling effective monitoring of online decision-making algorithms with the goal of (1) safeguarding individuals and (2) ensuring data quality. We elucidate guidelines and discuss our experience in monitoring online decision-making algorithms in two digital intervention clinical trials (Oralytics and MiWaves). Our guidelines include (1) developing fallback methods, pre-specified procedures executed when an issue occurs, and (2) identifying potential issues categorizing them by severity (red, yellow, and green). Across both trials, the monitoring systems detected real-time issues such as out-of-memory issues, database timeout, and failed communication with an external source. Fallback methods prevented participants from not receiving any treatment during the trial and also prevented the use of incorrect data in statistical analyses. These trials provide case studies for how health scientists can build monitoring systems for their digital intervention. Without these algorithm monitoring systems, critical issues would have gone undetected and unresolved. Instead, these monitoring systems safeguarded participants and ensured the quality of the resulting data for updating the intervention and facilitating scientific discovery. These monitoring guidelines and findings give digital intervention teams the confidence to include online decision-making algorithms in digital interventions.

algorithm, participant, rl algorithm, (16 more...)

arXiv.org Artificial Intelligence

2409.10526

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.55)

Add feedback

Roblox adds real-time AI chat translation using its own language model

EngadgetFeb-6-2024, 06:19:29 GMT

Currently serving over 70 million daily active users, Roblox is still going strong since its September 2006 launch -- almost 18 years ago. The development team is now taking one step further to boost the platform's massive community, by way of providing real-time AI chat translation to connect gamers around the world. According to CTO Daniel Sturman, his team needed to build their own "unified, transformer-based translation LLM (large language model)" in order to seamlessly handle all 16 languages supported on Roblox, as well as to recognize Roblox-specific slangs and abbreviations (this writer just learned that "obby" refers to an obstacle course in the game). As a result, the chat window always displays the conversation in the user's own tongue -- with a small latency of around 100 milliseconds, so it's pretty much real time. You can also click on the translation icon on the left of each line to see it in its original language.

language model, own language model, real-time ai chat translation, (4 more...)

Engadget

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

Add feedback